74 research outputs found

    The word landscape of the non-coding segments of the Arabidopsis thaliana genome

    Background: Genome sequences can be conceptualized as arrangements of motifs or words. The frequencies and positional distributions of these words within particular non-coding genomic segments provide important insights into how the words function in processes such as mRNA stability and regulation of gene expression. Results: Using an enumerative word discovery approach, we investigated the frequencies and positional distributions of all 65,536 different 8-letter words in the genome of Arabidopsis thaliana. Focusing on promoter regions, introns, and 3' and 5' untranslated regions (3'UTRs and 5'UTRs), we compared word frequencies in these segments to genome-wide frequencies. The statistically interesting words in each segment were clustered with similar words to generate motif logos. We investigated whether words were clustered at particular locations or were distributed randomly within each genomic segment, and we classified the words using gene expression information from public repositories. Finally, we investigated whether particular sets of words appeared together more frequently than others. Conclusion: Our studies provide a detailed view of the word composition of several segments of the non-coding portion of the Arabidopsis genome. Each segment contains a unique word-based signature. The respective signatures consist of the sets of enriched words, 'unwords', and word pairs within a segment, as well as the preferential locations and functional classifications for the signature words. Additionally, the positional distributions of enriched words within the segments highlight possible functional elements, and the co-associations of words in promoter regions likely represent the formation of higher order regulatory modules. This work is an important step toward fully cataloguing the functional elements of the Arabidopsis genome.
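    The enumerative approach described in this abstract can be illustrated with a minimal sketch. It counts overlapping 8-letter words in a sequence and compares a word's relative frequency in a segment with its genome-wide relative frequency. The function names (count_words, enrichment) and the plain frequency ratio are assumptions made for illustration; the abstract does not specify the statistical test the authors used.

```python
from collections import Counter
from itertools import product


def count_words(sequence, k=8):
    """Count every overlapping k-letter word made only of A, C, G, T."""
    sequence = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(word) <= alphabet:
            counts[word] += 1
    return counts


def enrichment(segment_counts, genome_counts, word):
    """Hypothetical score: relative frequency in the segment divided by
    relative frequency genome-wide. Returns None when undefined."""
    seg_total = sum(segment_counts.values())
    gen_total = sum(genome_counts.values())
    if seg_total == 0 or gen_total == 0 or genome_counts[word] == 0:
        return None
    seg_freq = segment_counts[word] / seg_total
    gen_freq = genome_counts[word] / gen_total
    return seg_freq / gen_freq


# Every possible 8-letter word over a 4-letter alphabet: 4**8 of them.
all_words = ["".join(p) for p in product("ACGT", repeat=8)]
```

    A ratio above 1 would mark a word as over-represented in the segment relative to the genome. Words that never occur in a segment get a ratio of 0 here; whether that matches the abstract's 'unwords' is not stated in the source.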

    Reverse Engineering of Computer-Based Navy Systems

    The financial pressure to meet the need for change in computer-based systems through evolution rather than through revolution has spawned the discipline of reengineering. One driving factor of reengineering is that it is increasingly becoming the case that enhanced requirements placed on computer-based systems are overstressing the processing resources of the systems. Thus, the distribution of processing load over highly parallel and distributed hardware architectures has become part of the reengineering process for computer-based Navy systems. This paper presents an intermediate representation (IR) for capturing features of computer-based systems to enable reengineering for concurrency. A novel feature of the IR is that it incorporates the mission critical software architecture, a view that enables information to be captured at five levels of granularity: the element/program level, the task level, the module/class/package level, the method/procedure level, and the statement/instruction level. An approach to reverse engineering is presented, in which the IR is captured, and is analyzed to identify potential concurrency. Thus, the paper defines concurrency metrics to guide the reengineering tasks of identifying, enhancing, and assessing concurrency, and for performing partitioning and assignment. Concurrency metrics are defined at several tiers of the mission critical software architecture. In addition to contributing an approach to reverse engineering for computer-based systems, the paper also discusses a reverse engineering analysis toolset that constructs and displays the IR and the concurrency metrics for Ada programs. Additionally, the paper contains a discussion of the context of our reengineering efforts within the United States Navy, by describing two reengineering projects focused on subsystems of the AEGIS Weapon System.

    Inferring causal molecular networks: empirical assessment through a community-based effort

    It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense

    Inferring causal molecular networks: empirical assessment through a community-based effort

    Inferring molecular networks is a central challenge in computational biology. However, it has remained unclear whether causal, rather than merely correlational, relationships can be effectively inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge that focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results constitute the most comprehensive assessment of causal network inference in a mammalian setting carried out to date and suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess the causal validity of inferred molecular networks

    Categorization of Programs Using Neural Networks

    This paper describes some experiments based on the use of neural networks for assistance in the quality assessment of programs, especially in connection with the reengineering of legacy systems. We use Kohonen networks, or self-organizing maps, for the categorization of programs: programs with similar features are grouped together in a two-dimensional neighbourhood, whereas dissimilar programs are located far apart. Backpropagation networks are used for generalization purposes: based on a set of example programs whose relevant aspects have already been assessed, we would like to obtain an extrapolation of these assessments to new programs. The basis for these investigations is an intermediate representation of programs in the form of various dependency graphs, capturing the essentials of the programs. Previously, a set of metrics has been developed to perform an assessment of programs on the basis of this intermediate representation. It is not always clear, however, which parameters of the intermediate representation are relevant for a particular metric. The categorization and generalization capabilities of neural networks are employed to improve or verify the selection of parameters, and might even initiate the development of additional metrics.